Abstract

Functional data have been the subject of many research works in recent years, and functional
regression is one of the most discussed topics. In particular, significant advances have been made
for functional linear regression models with scalar response. Let $(\mathcal{H}, \langle\cdot,\cdot\rangle)$ be a separable Hilbert
space. We focus on the model $Y = \langle\Theta, X\rangle + b + \varepsilon$, where $Y$ and $\varepsilon$ are real random variables, $X$ is
an $\mathcal{H}$-valued random element, and the model parameters $b$ and $\Theta$ are in $\mathbb{R}$ and $\mathcal{H}$, respectively.
Furthermore, the error satisfies $E(\varepsilon|X) = 0$ and $E(\varepsilon^2|X) = \sigma^2 < \infty$. A consistent bootstrap
method to calibrate the distribution of statistics for testing $H_0: \Theta = 0$ versus $H_1: \Theta \neq 0$ is
developed. The asymptotic theory, as well as a simulation study and a real data application
illustrating the usefulness of our proposed bootstrap in practice, is presented.

Keywords: Bootstrap; bootstrap consistency; functional linear regression; functional principal components 
analysis; hypothesis test. 

1 Introduction 

Functional Data Analysis (FDA) has become one of the most interesting statistical fields. In
particular, functional regression models have been studied from a parametric point of view (see
Ramsay and Silverman (2002, 2005)) and from a non-parametric one (see Ferraty and Vieu (2006)),
with the most recent advances compiled in Ferraty and Romain (2011). This work focuses on the
parametric approach, specifically on the functional linear regression model with scalar response
described below.

Let $(\mathcal{H}, \langle\cdot,\cdot\rangle)$ be a separable Hilbert space, and let $\|\cdot\|$ be the norm associated with its inner product.
Moreover, let $(\Omega, \sigma, P)$ be a probability space and let $(X, Y)$ be a measurable mapping from
$\Omega$ to $\mathcal{H} \times \mathbb{R}$, that is, $X$ is an $\mathcal{H}$-valued random element whereas $Y$ is a real random variable. In
this situation, let us assume that $(X, Y)$ satisfies the following linear model with scalar response,

$$Y = \langle\Theta, X\rangle + b + \varepsilon, \qquad (1)$$

where $\Theta \in \mathcal{H}$ is a fixed functional model parameter, $b \in \mathbb{R}$ is the intercept term, and $\varepsilon$ is a real
random variable such that $E(\varepsilon|X) = 0$ and $E(\varepsilon^2|X) = \sigma^2 < \infty$. Many authors have dealt with model
(1), and the methods based on Functional Principal Components Analysis (FPCA) are among the
most popular ones for estimating the model parameters (see Cardot, Ferraty, and Sarda (1999, 2003),
Cai and Hall (2006), Hall and Hosseini-Nasab (2006), and Hall and Horowitz (2007)).

The main aim of this work is to develop a consistent general bootstrap resampling approach to
calibrate the distribution of statistics for testing the significance of the relationship between $X$ and
$Y$, that is, for testing $H_0: \Theta = 0$ versus $H_1: \Theta \neq 0$, on the basis of a simple random sample
$\{(X_i, Y_i)\}_{i=1}^n$ drawn from $(X, Y)$. Bootstrap techniques become a useful alternative when the
asymptotic distributions of the test statistics are unknown or when they are inaccurate because of
a small sample size.

Since its introduction by Efron (1979), it is well-known that the bootstrap method results in a
new distribution approximation which can be applied to a large number of situations, such as the
calibration of pivotal quantities in the finite dimensional context (see Bickel and Freedman (1981),
and Singh (1981)). As far as multivariate regression models are concerned, bootstrap validity for
linear and non-parametric models has also been established in the literature (see Freedman (1981),
and Cao-Abad (1991)). More recently, the bootstrap has been successfully applied in the functional
setting. For instance, Cuevas, Febrero, and Fraiman (2006) have proposed bootstrap confidence
bands for several functional estimators such as the sample and the trimmed functional means. In the
regression context, Ferraty, Van Keilegom, and Vieu (2010), and González-Manteiga and
Martínez-Calvo (2011) have shown the validity of the bootstrap in the estimation of non-parametric
functional regression and functional linear model, respectively, when the response is scalar. They
have also proposed pointwise confidence intervals for the regression operator involved in each case.
In addition, the asymptotic validity of a componentwise bootstrap procedure has been proved by
Ferraty, Van Keilegom, and Vieu (2012) when a non-parametric regression is considered and both
response and regressor are functional.

Bootstrap techniques can also be very helpful for testing purposes, since they can be used to
approximate the distribution of the statistic under the null hypothesis $H_0$. For example, Cuevas,
Febrero, and Fraiman (2004) have developed a sort of parametric bootstrap to obtain quantiles
for an ANOVA test, and González-Rodríguez, Colubi, and Gil (2012) have proved the validity of
a residual bootstrap in that context. Hall and Vial (2006) and, more recently, Bathia, Yao, and
Ziegelmann (2010) have studied the finite dimensionality of functional data using a bootstrap
approximation for independent and dependent data, respectively.

As indicated previously, our goal is to test the lack of dependence between $X$ and $Y$. This
issue has attracted great interest in recent years because of its practical applications in the
functional context. For instance, Kokoszka, Maslova, Sojka, and Zhu (2008) proposed a test for lack
of dependence in the functional linear model with functional response, which was applied to
magnetometer curves consisting of minute-by-minute records of the horizontal intensity of the
magnetic field measured at observatories located at different latitudes. The aim was to analyse
whether the high-latitude records had a linear effect on the mid- or low-latitude records. On the
other hand, Cardot, Prchal, and Sarda (2007) presented a statistical procedure to check whether a
real-valued covariate has an effect on a functional response in a nonparametric regression context,
and used this methodology for a study of atmospheric radiation. In this case, the dataset consisted
of radiation profile curves measured at a random time, and the authors tested whether the radiation
profiles changed over time.

Regarding the regression model (1), testing the significance of the relationship between a functional
covariate and a scalar response has been the subject of recent contributions, and asymptotic
approaches for this problem can be found in Cardot, Ferraty, Mas, and Sarda (2003) or Kokoszka,
Maslova, Sojka, and Zhu (2008). The methods presented in these two works are mainly based on
calibrating the distribution of the statistics by means of asymptotic approximations. In
contrast, we propose a consistent bootstrap calibration to approximate the distribution of the
statistics. To that end, Section 2 introduces some notation and basic concepts about the
regression model (1), the asymptotic theory for the testing procedure, and the consistency of the
bootstrap techniques that we propose. In Section 3, the bootstrap calibration is presented as an
alternative to the asymptotic theory previously discussed. Section 4 is then devoted to the empirical
results: a simulation study and a real data application show the performance of our
bootstrap methodology in comparison with the asymptotic approach. Finally, some conclusions are
summarized in Section 5.

2 Asymptotic theory and bootstrap 

Consider the model (1) introduced in Section 1. In this framework, the regression
function, denoted by $m$, is given by

$$m(x) = E(Y|X = x) = \langle\Theta, x\rangle + b \quad \text{for all } x \in \mathcal{H}.$$

The aim is to develop correct and consistent bootstrap techniques for testing

$$H_0: \Theta = 0 \quad \text{versus} \quad H_1: \Theta \neq 0 \qquad (2)$$

on the basis of a random sample $\{(X_i, Y_i)\}_{i=1}^n$ of independent and identically distributed random
elements with the same distribution as $(X, Y)$. That is, our objective is to check whether $X$ and $Y$
are linearly independent ($H_0$) or not ($H_1$).

Next, we briefly present the technical background required to develop the theoretical results
presented throughout the section.

2.1 Some background 

The Riesz Representation Theorem ensures that the functional linear model with scalar response can be
handled theoretically within the considered framework. Specifically, let $\mathcal{H}$ be the separable Hilbert
space of square Lebesgue integrable functions on a given compact set $C \subset \mathbb{R}$, denoted by $L^2(C, \lambda)$,
with the usual inner product and the associated norm $\|\cdot\|$. The functional linear model with scalar
response between a random function $X$ and a real random variable $Y$ is defined as

$$Y = \Phi(X) + \varepsilon, \qquad (3)$$

where $\Phi$ is a continuous linear operator (that is, $\Phi \in \mathcal{H}'$, where $\mathcal{H}'$ is the dual space of $\mathcal{H}$ with norm
$\|\cdot\|'$), and $\varepsilon$ is a real random variable with finite variance and independent of $X$. By virtue of the Riesz
Representation Theorem, $\mathcal{H}$ and $\mathcal{H}'$ are isometrically identified, in such a way that for any $\Phi \in \mathcal{H}'$
there exists a unique $\Theta \in \mathcal{H}$ so that $\|\Theta\| = \|\Phi\|'$ and $\Phi(h) = \langle\Theta, h\rangle$ for all $h \in \mathcal{H}$. Consequently,
the model presented in equation (3) is just a particular case of the one considered in (1).

Previous works regarding functional linear models assume $b = 0$ (see Cardot, Ferraty, Mas, and
Sarda (2003), and Kokoszka, Maslova, Sojka, and Zhu (2008)). Of course, the intercept term can
be embedded in the variable counterpart of the model, as in the multivariate case, as follows. Let
$\mathcal{H}_e$ be the product space $\mathcal{H} \times \mathbb{R}$ with the corresponding inner product $\langle\cdot,\cdot\rangle_e$, and define $X' = (X, 1)$
and $\Theta' = (\Theta, b) \in \mathcal{H}_e$. Then the model considered in (1) can be rewritten as $Y = \langle\Theta', X'\rangle_e + \varepsilon$
(and consequently $X'$ cannot be assumed to be centered). Nevertheless, in the context of the linear
independence test, the aim is to check whether $\Theta = 0$ or not, and this is not equivalent to checking whether
$\Theta' = 0$ or not. In addition, in practice the intercept term $b$ cannot be assumed to be equal to 0. Thus,
in order to avoid any kind of confusion, in this paper the intercept term $b$ has been written explicitly.

In the same way, in the above mentioned papers, the random element $X$ is assumed to be centered.
Although, in many cases, the asymptotic distribution of the proposed statistics does not change if
$\{X_i\}_{i=1}^n$ is replaced by the dependent sample $\{X_i - \bar{X}\}_{i=1}^n$, the situation with the bootstrap version
of the statistics could be quite different. In fact, as will be shown later, different bootstrap
statistics could be considered when this replacement is done. Hence, for the developments in this
section, it will not be assumed that the variable $X$ is centered.

2.2 Linear independence test

Given a generic $\mathcal{H}$-valued random element $H$ such that $E(\|H\|^2) < \infty$, its associated covariance
operator $\Gamma_H$ is defined as the operator $\Gamma_H : \mathcal{H} \to \mathcal{H}$

$$\Gamma_H(h) = E(\langle H - \mu_H, h\rangle (H - \mu_H)) = E(\langle H, h\rangle H) - \langle\mu_H, h\rangle \mu_H,$$

for all $h \in \mathcal{H}$, where $\mu_H \in \mathcal{H}$ denotes the expected value of $H$. From now on, it will be assumed
that $E(\|X\|^2) < \infty$, and thus, as a consequence of Hölder's inequality, $E(Y^2) < \infty$. Whenever
there is no possible confusion, $\Gamma_X$ will be abbreviated as $\Gamma$. It is well-known that $\Gamma$ is a nuclear
and self-adjoint operator. In particular, it is a compact operator of trace class and thus, by virtue
of the Spectral Decomposition Theorem, there is an orthonormal basis of $\mathcal{H}$, $\{v_j\}_{j\in\mathbb{N}}$, consisting of
eigenvectors of $\Gamma$ with corresponding eigenvalues $\{\lambda_j\}_{j\in\mathbb{N}}$, that is, $\Gamma(v_j) = \lambda_j v_j$ for all $j \in \mathbb{N}$. As
usual, the eigenvalues are assumed to be arranged in decreasing order ($\lambda_1 \geq \lambda_2 \geq \ldots$). Since the
operator $\Gamma$ is symmetric and non-negative definite, the eigenvalues are non-negative.

In a similar way, let us consider the cross-covariance operator $\Delta : \mathcal{H} \to \mathbb{R}$ between $X$ and $Y$ given
by

$$\Delta(h) = E(\langle X - \mu_X, h\rangle (Y - \mu_Y)) = E(\langle X, h\rangle Y) - \langle\mu_X, h\rangle \mu_Y,$$

for all $h \in \mathcal{H}$, where $\mu_Y \in \mathbb{R}$ denotes the expected value of $Y$. Of course, $\Delta \in \mathcal{H}'$ and the following
relation between the considered operators and the regression parameter $\Theta$ is satisfied

$$\Delta(\cdot) = \langle\Gamma(\cdot), \Theta\rangle. \qquad (4)$$

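The Hilbert space $\mathcal{H}$ can be expressed as the direct sum of the two orthogonal subspaces induced
by the self-adjoint operator $\Gamma$: the kernel or null space of $\Gamma$, $\mathcal{N}(\Gamma)$, and the closure of the image or
range of $\Gamma$, $\overline{\mathcal{R}(\Gamma)}$. Thus, $\Theta$ is determined uniquely by $\Theta = \Theta_1 + \Theta_2$ with $\Theta_1 \in \mathcal{N}(\Gamma)$ and $\Theta_2 \in \overline{\mathcal{R}(\Gamma)}$.
As $\Theta_1 \in \mathcal{N}(\Gamma)$, it is easy to check that $\mathrm{Var}(\langle X, \Theta_1\rangle) = \langle\Gamma(\Theta_1), \Theta_1\rangle = 0$ and, consequently, the model
introduced in (1) can be expressed as

$$Y = \langle\Theta_2, X\rangle + \langle\Theta_1, \mu_X\rangle + b + \varepsilon.$$

Therefore, it is not possible to distinguish between the term $\langle\Theta_1, \mu_X\rangle$ and the intercept term $b$,
and consequently it is not possible to check whether $\Theta_1 = 0$ or not. Taking this into account, the
hypothesis test will be restricted to check

$$H_0: \Theta_2 = 0 \quad \text{versus} \quad H_1: \Theta_2 \neq 0 \qquad (5)$$

on the basis of the available sample information.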

Note that in this case, according to the relation between the operators and the regression parameter
shown in (4), $\Theta_2 = 0$ if, and only if, $\Delta(h) = 0$ for all $h \in \mathcal{H}$. Consequently, the hypothesis test in
(5) is equivalent to

$$H_0: \|\Delta\|' = 0 \quad \text{versus} \quad H_1: \|\Delta\|' \neq 0. \qquad (6)$$

Remark 1. It should be recalled that, in previous works, $\mu_X$ is assumed to be equal to 0. Thus,
the preceding reasoning leads to the fact that $\Theta_1$ cannot be estimated based on the information
provided by $X$ (see, for instance, Cardot, Ferraty, Mas, and Sarda (2003)). Consequently, the
hypothesis test is also restricted to the one in the preceding equations. In addition, in Cardot,
Ferraty, Mas, and Sarda (2003) it is also assumed for technical reasons that $\overline{\mathcal{R}(\Gamma)}$ is an
infinite-dimensional space. In contrast, this restriction is not imposed in the present study.

Remark 2. Note that another usual assumption is that the intercept term vanishes. Although this
is not common in most situations, it should be noted that if $b = 0$ and $X$ is not assumed to be
centered (as in this work), then an interesting possibility appears: to check whether $\Theta_1 = 0$ or not
by checking the nullity of the intercept term of the model, and thus to check the original hypothesis
test in (2). This open problem cannot be solved with the methodology employed in the current
paper (or in the previous ones) because the idea is based on checking (6), which is equivalent to the
restricted test (5) but not to the unrestricted one in (2).

2.3 Testing procedure and asymptotic theory 

According to the relation between $\|\cdot\|'$ and $\|\cdot\|$, the dual norm of $\Delta \in \mathcal{H}'$ can be expressed
equivalently in terms of the $\mathcal{H}$-valued random element $(X - \mu_X)(Y - \mu_Y)$ as follows

$$\|\Delta\|' = \|\langle E((X - \mu_X)(Y - \mu_Y)), \cdot\rangle\|' = \|E((X - \mu_X)(Y - \mu_Y))\|.$$

Thus, based on an i.i.d. sample $\{(X_i, Y_i)\}_{i=1}^n$ drawn from $(X, Y)$,

$$D = \|E((X - \mu_X)(Y - \mu_Y))\| = \|T\|$$

can be estimated in a natural way by means of its empirical counterpart $D_n = \|T_n\|$, where $T_n$ is
the $\mathcal{H}$-valued random element given by

$$T_n = \frac{1}{n}\sum_{i=1}^n (X_i - \bar{X})(Y_i - \bar{Y}),$$

where $\bar{X}$ and $\bar{Y}$ denote as usual the corresponding sample means. The next theorem establishes
some basic properties of $T_n$.
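
For curves observed on a common equispaced grid, $T_n$ and $D_n = \|T_n\|$ can be computed directly. The following is a minimal sketch, assuming $\mathcal{H} = L^2[0,1]$ and a Riemann-sum approximation of the norm; the function names, the array layout and the grid spacing `dt` are illustrative choices and are not taken from the paper.

```python
import numpy as np

def cross_cov_element(X, Y):
    """Empirical element T_n = (1/n) * sum_i (X_i - Xbar) * (Y_i - Ybar).

    X: (n, p) array, row i holds curve X_i on a grid of p points (assumed layout).
    Y: (n,) array of scalar responses.
    """
    Xc = X - X.mean(axis=0)      # center the curves pointwise
    Yc = Y - Y.mean()            # center the responses
    return Xc.T @ Yc / len(Y)    # (p,) array: T_n evaluated on the grid

def l2_norm(f, dt):
    """Riemann-sum approximation of the L2 norm on a grid with spacing dt."""
    return float(np.sqrt(np.sum(f ** 2) * dt))

# Tiny worked example with n = 2 curves on p = 2 grid points.
X = np.array([[0.0, 0.0], [2.0, 4.0]])
Y = np.array([0.0, 2.0])
T_n = cross_cov_element(X, Y)    # equals [1.0, 2.0]
D_n = l2_norm(T_n, dt=1.0)       # equals sqrt(5)
```

A constant response gives $T_n = 0$ exactly, which is the sample analogue of $\|\Delta\|' = 0$.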

Theorem 1. Assuming that (1) holds with $E(\varepsilon) = 0$, $E(\varepsilon^2) = \sigma^2 < \infty$ and $E(\|X\|^4) < \infty$, then

1. $E(T_n) = E((X - \mu_X)(Y - \mu_Y))(n - 1)/n$

2. $T_n$ converges a.s.-$P$ to $E((X - \mu_X)(Y - \mu_Y))$ as $n \to \infty$

3. $\sqrt{n}(T_n - E((X - \mu_X)(Y - \mu_Y)))$ converges in law, as $n \to \infty$, to a centered Gaussian
element $Z$ in $\mathcal{H}$ with covariance operator

$$\Gamma_Z(\cdot) = \sigma^2\Gamma(\cdot) + E\left((X - \mu_X)\langle X - \mu_X, \cdot\rangle \langle X - \mu_X, \Theta\rangle^2\right).$$



Proof. Since $T_n$ can be equivalently expressed as

$$T_n = \frac{1}{n}\sum_{i=1}^n (X_i - \mu_X)(Y_i - \mu_Y) - (\bar{X} - \mu_X)(\bar{Y} - \mu_Y),$$

it is straightforward to check item 1. The a.s.-$P$ convergence is a direct application of the SLLN
for separable Hilbert-valued random elements.

On the other hand, given that $E(\|(X - \mu_X)(Y - \mu_Y)\|^2) < \infty$, the convergence in law can be
deduced by applying the CLT for separable Hilbert-valued random elements (see, for instance, Laha
and Rohatgi (1979)) together with Slutsky's Theorem. The concrete expression of the operator
$\Gamma_Z$, that is, $\Gamma_Z = \Gamma_{(X-\mu_X)(Y-\mu_Y)} = \Gamma_{(X-\mu_X)\varepsilon} + \Gamma_{(X-\mu_X)\langle X-\mu_X, \Theta\rangle}$, can be obtained by simple
computations. □

In order to simplify the notation, from now on, given any $\mathcal{H}$-valued random element $H$ with
$E(\|H\|^2) < \infty$, $Z_H$ will denote a centered Gaussian element in $\mathcal{H}$ with covariance operator $\Gamma_H$.

Corollary 1. Under the conditions of Theorem 1, if the null hypothesis $H_0 : \|\Delta\|' = 0$ is satisfied,
then $\sqrt{n}T_n$ converges in law to $Z_{(X-\mu_X)\varepsilon}$ (with covariance operator $\sigma^2\Gamma$), and consequently, $\|\sqrt{n}T_n\|$
converges in law to $\|Z_{(X-\mu_X)\varepsilon}\|$.

In contrast to Theorem 1 in Cardot, Ferraty, Mas, and Sarda (2003), the result in Corollary 1 is
established directly on the Hilbert space $\mathcal{H}$ instead of on its dual space. In addition, neither
centered $X$ random elements nor a null intercept term need to be assumed. Nevertheless, these two
assumptions could be easily removed in that paper in order to establish a dual result of Corollary 1.

Furthermore, in view of Corollary 1, the asymptotic null distribution of $\|\sqrt{n}T_n\|$ is not explicitly
known. This is the reason why no further research on how to use this statistic (or its
dual one) in practice for checking whether $\Theta_2$ equals 0 is carried out in Cardot, Ferraty, Mas, and Sarda (2003).
Instead, an alternative statistic, which is used in the simulation section for comparative purposes, is
considered there. Nevertheless, it is still possible to use $\|\sqrt{n}T_n\|$ as a core statistic in order to solve this
test in practice by means of bootstrap techniques.



One natural way of using the asymptotic result of Corollary 1 for solving the test under study
is as follows. Consider a consistent (at least under $H_0$) estimator $\hat\sigma_n^2$ of $\sigma^2$ (for instance, the
sample variance of $Y$ could be used, or perhaps the one introduced by Cardot, Ferraty, Mas, and
Sarda (2003), provided that its theoretical behavior is analyzed). Then, according to Slutsky's
Theorem, $\|\sqrt{n}T_n\|/\hat\sigma_n$ converges in law under $H_0$ to the norm of $Z_X$. As its covariance operator $\Gamma$ is
unknown, it can be approximated by the empirical one $\Gamma_n$, and thus $\|Z_X\|$ can be approximated
by $\|Z_n\|$, where $Z_n$ is a centered Gaussian element in $\mathcal{H}$ with covariance operator $\Gamma_n$. Of course, the
distribution of $\|Z_n\|$ is still difficult to compute directly; nevertheless, one can make use of the CLT
and approximate it by the Monte Carlo method, using the distribution of

$$\left\| \frac{1}{\sqrt{m}} \sum_{i=1}^m (X_i^* - \bar{X}) \right\|$$

for a large value of $m$, where $\{X_i^*\}_{i=1}^m$ are i.i.d. random elements chosen at random from the fixed
population $(X_1, \ldots, X_n)$. Obviously, this method is a precursor of the bootstrap procedures.
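
The Monte Carlo approximation of $\|Z_n\|$ described in this paragraph can be sketched as follows. This is an illustration only, assuming curves discretized on an equispaced grid of $[0,1]$ and a Riemann-sum norm; the helper name `mc_gaussian_norms` and the values of `m` and `reps` are hypothetical.

```python
import numpy as np

def mc_gaussian_norms(X, m, reps, dt, rng):
    """Draw `reps` replicates of || m^{-1/2} * sum_{i<=m} (X*_i - Xbar) ||."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)                     # curves centered at the sample mean
    norms = np.empty(reps)
    for r in range(reps):
        idx = rng.integers(0, n, size=m)        # resample m curves with replacement
        z = Xc[idx].sum(axis=0) / np.sqrt(m)    # approx. Gaussian, covariance Gamma_n
        norms[r] = np.sqrt(np.sum(z ** 2) * dt)
    return norms

rng = np.random.default_rng(0)
p = 50
dt = 1.0 / p
# Brownian-type paths on the grid, used here only as example data.
X = np.cumsum(rng.normal(scale=np.sqrt(dt), size=(100, p)), axis=1)
norms = mc_gaussian_norms(X, m=500, reps=200, dt=dt, rng=rng)
crit = np.quantile(norms, 0.95)   # approximate 95% quantile of ||Z_n||
```

Under this sketch, $H_0$ would be rejected at the 5% level when $\|\sqrt{n}T_n\|/\hat\sigma_n$ exceeds `crit`.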



In order to complete the asymptotic study of the statistic $\|\sqrt{n}T_n\|$, its behavior under local
alternatives is now analyzed. To this purpose, let us consider $\Theta \in \mathcal{H}$ so that $\|\Theta_2\| > 0$, and given
$\delta_n > 0$ consider the modified random sample

$$Y_i^n = \left\langle X_i, \frac{\delta_n}{\sqrt{n}}\Theta \right\rangle + b + \varepsilon_i$$

for all $i \in \{1, \ldots, n\}$. Then, the null hypothesis is not verified. However, if $\delta_n/\sqrt{n} \to 0$, then
$\|(\delta_n/\sqrt{n})\Theta\| \to 0$, that is, $H_0$ is approached with "speed" $\delta_n/\sqrt{n}$. In these conditions,

$$E\left((X_i - \mu_X)(Y_i^n - \mu_{Y^n})\right) = \frac{\delta_n}{\sqrt{n}}\Gamma(\Theta),$$

and thus the following theorem, which establishes the behavior of the statistic under the considered
local alternatives, can be easily deduced.

Theorem 2. Under the conditions of Theorem 1 and with the above notation, if $\delta_n \to \infty$ and
$\delta_n/\sqrt{n} \to 0$ as $n \to \infty$ then

$$P\left( \left\| \frac{1}{\sqrt{n}} \sum_{i=1}^n (X_i - \bar{X})(Y_i^n - \bar{Y}^n) \right\| \leq t \right) \to 0$$

as $n \to \infty$, for all $t \in \mathbb{R}$.
2.4 Bootstrap procedures

The difficulty of using the previously proposed statistic to solve the hypothesis test by means of
asymptotic procedures suggests the development of appropriate bootstrap techniques. The
asymptotic consistency of a bootstrap approach is guaranteed if the associated bootstrap statistic converges
in law to a non-degenerate distribution irrespective of whether $H_0$ is satisfied or not. In addition, in
order to ensure its asymptotic correctness, this limit distribution must coincide with the asymptotic
one of the test statistic provided that $H_0$ holds.

Consequently, the asymptotic limit established in Corollary 1 plays a fundamental role in defining
appropriate bootstrap statistics. In this way, recall that

$$\frac{1}{\sqrt{n}}\sum_{i=1}^n \left( (X_i - \bar{X})(Y_i - \bar{Y}) - E((X - \mu_X)(Y - \mu_Y)) \right)$$

converges in law to $Z_{(X-\mu_X)(Y-\mu_Y)}$, irrespective of whether $H_0$ is satisfied or not and, in addition, if
$H_0$ is satisfied then $\Gamma_{(X-\mu_X)(Y-\mu_Y)} = \sigma^2\Gamma$. Thus, this is a natural statistic to be mimicked by a
bootstrap one. Note that

$$\left( \frac{1}{n}\sum_{i=1}^n (Y_i - \bar{Y})^2 \right)^{1/2} \left( \frac{1}{\sqrt{n}}\sum_{i=1}^n (X_i - \mu_X) \right) \qquad (7)$$

converges in law to $(\sigma^2 + E(\langle X - \mu_X, \Theta\rangle^2))^{1/2} Z_X$, whose covariance operator is $(\sigma^2 + E(\langle X - \mu_X, \Theta\rangle^2))\Gamma$.
In particular, when $H_0$ is satisfied, this operator reduces again to $\sigma^2\Gamma$. Consequently, another
possibility consists in mimicking this second statistic by means of a bootstrap one, improving the
approximation suggested in the previous subsection. Note that the left term in the product in
equation (7) could be replaced by any other estimator of $\sigma^2$ under $H_0$ that converges to a finite
constant if $H_0$ does not hold. In any case, this second approximation could lead to worse results under
the null hypothesis, because the possible dependency between $X$ and $\varepsilon$ is lost (as the resample
would focus only on the $X$ information).

Two possibilities for mimicking the above-mentioned statistics are going to be explored,
namely a "naive" paired bootstrap and a "wild" bootstrap approach. In order to achieve this goal, let
$\{(X_i^*, Y_i^*)\}_{i=1}^n$ be a collection of i.i.d. random elements drawn at random from $(X_1, Y_1), \ldots, (X_n, Y_n)$,
and let us consider the following "naive" paired bootstrap statistic

$$T_n^{N*} = \frac{1}{n}\sum_{i=1}^n \left( (X_i^* - \bar{X}^*)(Y_i^* - \bar{Y}^*) - (X_i - \bar{X})(Y_i - \bar{Y}) \right).$$

In addition, let us consider $\sigma_n^2 = (1/n)\sum_{i=1}^n (Y_i - \bar{Y})^2$ and $\sigma_n^{*2} = (1/n)\sum_{i=1}^n (Y_i^* - \bar{Y}^*)^2$, the
empirical estimator of $\sigma^2$ under $H_0$ and its corresponding bootstrap version.

The asymptotic behavior of the "naive" bootstrap statistic will be analyzed through some results
on bootstrapping general empirical measures obtained by Giné and Zinn (1990). It should be noted
that the bootstrap results in that paper refer to empirical processes indexed by a class of functions $\mathcal{F}$,
which in particular extend to the bootstrap about the mean in separable Banach (and thus Hilbert)
spaces. In order to establish this connection, it is enough to choose

$$\mathcal{F} = \{ f \in \mathcal{H}' \mid \|f\|' \leq 1 \}$$

(see Giné (1997) and Kosorok (2008) for a general overview of indexed empirical processes). $\mathcal{F}$ is
image admissible Suslin (considering the weak topology). In addition, $F(h) = \sup_{f\in\mathcal{F}} |f(h)| = \|h\|$
for all $h \in \mathcal{H}$ and thus $E(F^2(X)) = E(\|X\|^2) < \infty$.

Consider the bounded and linear (so continuous) operator $\delta$ from $\mathcal{H}$ to $l^\infty(\mathcal{F})$ given by $\delta(h)(f) =
\delta_h(f) = f(h)$ for all $h \in \mathcal{H}$ and all $f \in \mathcal{F}$, and denote by $R(\delta) \subset l^\infty(\mathcal{F})$ its range. As $\|\delta(h)\|_\infty = \|h\|$
for all $h \in \mathcal{H}$, there exists $\delta^{-1} : R(\delta) \to \mathcal{H}$ so that $\delta^{-1}$ is continuous. In addition, as $R(\delta)$
is closed, Dugundji's Theorem allows us to consider a continuous extension $\delta^{-1} : l^\infty(\mathcal{F}) \to \mathcal{H}$ (see
for instance Kosorok (2008), Lemma 6.16 and Theorem 10.9). Thus, following the typical empirical
process notation, the empirical process $(1/\sqrt{n})\sum_{i=1}^n (\delta_{X_i} - P)$ indexed in $\mathcal{F}$ is directly connected
with $(1/\sqrt{n})\sum_{i=1}^n (X_i - E(X))$ by means of the continuous mapping $\delta^{-1}$ and vice versa.

Some consequences of this formulation applied to the work developed by Giné and Zinn (1990) lead
to the results collected in the following lemma.

Lemma 1. Let $\xi$ be a measurable mapping from a probability space denoted by $(\Omega, \sigma, P)$ to a
separable Hilbert space $(\mathcal{H}, \langle\cdot,\cdot\rangle)$ with corresponding norm $\|\cdot\|$ so that $E(\|\xi\|^2) < \infty$. Let $\{\xi_i\}_{i=1}^n$ be
a sequence of i.i.d. random elements with the same distribution as $\xi$, and let $\{\xi_i^*\}_{i=1}^n$ be i.i.d. from
$\{\xi_i\}_{i=1}^n$. Then

1. $\sqrt{n}(\bar\xi^* - \bar\xi)$ converges in law to $Z_\xi$ a.s.-$P$

2. $\bar\xi^*$ converges in probability to $E(\xi)$ a.s.-$P$

3. $\overline{\|\xi^*\|^2} = (1/n)\sum_{i=1}^n \|\xi_i^*\|^2$ converges in probability to $E(\|\xi\|^2)$ a.s.-$P$

Proof. To prove item 1 note that the CLT for separable Hilbert-valued random elements (see,
for instance, Laha and Rohatgi (1979)) together with the Continuous Mapping Theorem applied
to $\delta$ guarantees that $\mathcal{F} \in CLT(P)$. Thus, Theorem 2.4 of Giné and Zinn (1990) ensures that
$n^{1/2}(\hat{P}_n(\omega) - P_n(\omega))$ converges in law to a Gaussian process on $\mathcal{F}$, $G = \delta(Z_\xi)$ a.s.-$P$. Consequently,
by applying again the Continuous Mapping Theorem, $\sqrt{n}(\bar\xi^* - \bar\xi) = \delta^{-1}(n^{1/2}(\hat{P}_n(\omega) - P_n(\omega)))$
converges in law to $Z_\xi = \delta^{-1}(G)$.

Items 2 and 3 can be checked in a similar way by applying Theorem 2.6 of Giné and Zinn (1990).
Note that item 1 is also a direct consequence of Remark 2.5 of Giné and Zinn (1990); nevertheless
it was proven based on Theorem 2.4 to illustrate the technique. □

The following theorem establishes the asymptotic consistency and correctness of the "naive"
bootstrap approach.

Theorem 3. Under the conditions of Theorem 1, we have that $\sqrt{n}T_n^{N*}$ converges in law to
$Z_{(X-\mu_X)(Y-\mu_Y)}$ a.s.-$P$. In addition, $\sigma_n^{*2}$ converges in probability to $\sigma_Y^2 = \sigma^2 + E(\langle X - \mu_X, \Theta\rangle^2)$
a.s.-$P$.

Proof. First of all consider the bootstrap statistic

$$S_n^* = \frac{1}{\sqrt{n}}\sum_{i=1}^n \left( (X_i^* - \mu_X)(Y_i^* - \mu_Y) - (X_i - \mu_X)(Y_i - \mu_Y) \right)$$

and note that $\{(X_i^* - \mu_X)(Y_i^* - \mu_Y)\}_{i=1}^n$ are i.i.d. $\mathcal{H}$-valued random elements chosen at random
from the "bootstrap population" $\{(X_i - \mu_X)(Y_i - \mu_Y)\}_{i=1}^n$. Then, item 1 in Lemma 1 guarantees
that $S_n^*$ converges in law to $Z_{(X-\mu_X)(Y-\mu_Y)}$ a.s.-$P$.
On the other hand, $S_n^*$ equals $\sqrt{n}T_n^{N*}$ plus the following terms

$$\sqrt{n}(\bar{X}^* - \bar{X})(\bar{Y}^* - \bar{Y}) + \sqrt{n}(\bar{X}^* - \bar{X})(\bar{Y} - \mu_Y) + (\bar{X} - \mu_X)\sqrt{n}(\bar{Y}^* - \bar{Y}).$$

Items 1 and 2 in Lemma 1, together with Slutsky's Theorem, ensure that these three terms converge
in probability to 0 a.s.-$P$, and consequently the convergence in law stated in the theorem is proven.

Finally, the convergence of $\sigma_n^{*2}$ holds by virtue of items 2 and 3 in Lemma 1. □
The "naive" bootstrap approach is described in the following algorithm.

Algorithm 1 (Naive Bootstrap).

Step 1. Compute the value of the statistic $T_n$ (or the value $T_n/\sigma_n$).

Step 2. Draw $\{(X_i^*, Y_i^*)\}_{i=1}^n$, a sequence of i.i.d. random elements chosen at random from the
initial sample $(X_1, Y_1), \ldots, (X_n, Y_n)$, and compute $a_n = \|T_n^{N*}\|$ (or $b_n = \|T_n^{N*}\|/\sigma_n^*$).

Step 3. Repeat Step 2 a large number of times $B \in \mathbb{N}$ in order to obtain a sequence of values
$\{a_n^l\}_{l=1}^B$ (or $\{b_n^l\}_{l=1}^B$).

Step 4. Approximate the p-value of the test by the proportion of values in $\{a_n^l\}_{l=1}^B$ greater than or
equal to $\|T_n\|$ (or by the proportion of values in $\{b_n^l\}_{l=1}^B$ greater than or equal to $\|T_n\|/\sigma_n$).
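
The four steps of the naive bootstrap can be sketched in code as follows. This is an illustration under stated assumptions rather than the authors' implementation: curves are discretized on an equispaced grid of $[0,1]$, the norm is a Riemann sum, and the bootstrap statistic is taken to be the resampled $T_n$ minus the original $T_n$ (the paired statistic centered at the sample value). All function names are hypothetical.

```python
import numpy as np

def t_n(X, Y):
    """T_n = (1/n) * sum_i (X_i - Xbar)(Y_i - Ybar), evaluated on the grid."""
    return (X - X.mean(axis=0)).T @ (Y - Y.mean()) / len(Y)

def l2_norm(f, dt):
    """Riemann-sum approximation of the L2 norm on a grid with spacing dt."""
    return float(np.sqrt(np.sum(f ** 2) * dt))

def naive_bootstrap_pvalue(X, Y, B, dt, rng):
    n = len(Y)
    T = t_n(X, Y)
    stat = l2_norm(T, dt)                        # Step 1: ||T_n||
    boot = np.empty(B)
    for l in range(B):                           # Steps 2 and 3
        idx = rng.integers(0, n, size=n)         # resample pairs with replacement
        boot[l] = l2_norm(t_n(X[idx], Y[idx]) - T, dt)   # ||T_n^{N*}|| (assumed form)
    return float(np.mean(boot >= stat))          # Step 4: bootstrap p-value

rng = np.random.default_rng(1)
n, p = 200, 50
dt = 1.0 / p
X = np.cumsum(rng.normal(scale=np.sqrt(dt), size=(n, p)), axis=1)  # example curves
Y_dep = X.sum(axis=1) * dt      # noiseless response <Theta, X> with Theta = 1
p_dep = naive_bootstrap_pvalue(X, Y_dep, B=200, dt=dt, rng=rng)
p_const = naive_bootstrap_pvalue(X, np.ones(n), B=50, dt=dt, rng=rng)
```

With a constant response both the statistic and every bootstrap replicate are exactly zero, so the p-value is 1; with a noiseless linear response the p-value is expected to be very small.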

Analogously, let $\{\varepsilon_i^*\}_{i=1}^n$ be i.i.d. centered real random variables so that $E((\varepsilon_i^*)^2) = 1$ and
$\int_0^\infty (P(|\varepsilon_1^*| > t))^{1/2}\,dt < \infty$ (to guarantee this last assumption, it is enough that $E(|\varepsilon_i^*|^d) < \infty$ for
certain $d > 2$), and consider the "wild" bootstrap statistic

$$T_n^{W*} = \frac{1}{n}\sum_{i=1}^n (X_i - \bar{X})(Y_i - \bar{Y})\varepsilon_i^*.$$

In order to analyze the asymptotic behavior of the "wild" bootstrap statistic, the following lemma
will be fundamental. It is a particularization of a result due to Ledoux, Talagrand and Zinn (cf.
Giné and Zinn (1990), and Ledoux and Talagrand (1988)). See also the Multiplier Central Limit
Theorem in Kosorok (2008) for the counterpart for empirical processes indexed by a class of
measurable functions.

Lemma 2. Let $\xi$ be a measurable mapping from a probability space denoted by $(\Omega, \sigma, P)$ to a
separable Hilbert space $(\mathcal{H}, \langle\cdot,\cdot\rangle)$ with corresponding norm $\|\cdot\|$ so that $E(\|\xi\|^2) < \infty$. Let $\{\xi_i\}_{i=1}^n$ be
a sequence of i.i.d. random elements with the same distribution as $\xi$, and let $\{W_i\}_{i=1}^n$ be a sequence
of i.i.d. random variables (in the same probability space and independent of $\{\xi_i\}_{i=1}^n$) with $E(W_i) = 0$
and $\int_0^\infty (P(|W_1| > t))^{1/2}\,dt < \infty$. Then the following are equivalent

1. $E(\|\xi\|^2) < \infty$ (and consequently $\sqrt{n}(\bar\xi - E(\xi))$ converges in law to $Z_\xi$).

2. For almost all $\omega \in \Omega$, $(1/\sqrt{n})\sum_{i=1}^n W_i\xi_i(\omega)$ converges in law to $Z_\xi$.

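As a consequence, the asymptotic consistency and correctness of the "wild" bootstrap approach is
guaranteed by the following theorem.

Theorem 4. Under the conditions of Theorem 1, we get that $\sqrt{n}T_n^{W*}$ converges in law to
$Z_{(X-\mu_X)(Y-\mu_Y)}$ a.s.-$P$.

Proof. According to Lemma 2, for almost all $\omega \in \Omega$,

$$S_n^* = \frac{1}{\sqrt{n}}\sum_{i=1}^n (X_i^\omega - \mu_X)(Y_i^\omega - \mu_Y)\varepsilon_i^*$$

converges in law to $Z_{(X-\mu_X)(Y-\mu_Y)}$. Moreover, $(\bar{Y}^\omega - \mu_Y)$ and $(\bar{X}^\omega - \mu_X)$ converge to 0 (by the SLLN).
Finally note that, for almost all $\omega \in \Omega$,

$$S_n^* = \sqrt{n}T_n^{W*} + (\bar{Y}^\omega - \mu_Y)\frac{1}{\sqrt{n}}\sum_{i=1}^n (X_i^\omega - \mu_X)\varepsilon_i^* + (\bar{X}^\omega - \mu_X)\frac{1}{\sqrt{n}}\sum_{i=1}^n (Y_i^\omega - \mu_Y)\varepsilon_i^* - (\bar{X}^\omega - \mu_X)(\bar{Y}^\omega - \mu_Y)\frac{1}{\sqrt{n}}\sum_{i=1}^n \varepsilon_i^*.$$

Lemma 2, together with the above-mentioned SLLN, guarantees the convergence in probability to
0 of the last three summands, and thus the result follows by virtue of Slutsky's Theorem. □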

The proposed "wild" bootstrap approach can be applied by means of the following algorithm.

Algorithm 2 (Wild Bootstrap).

Step 1. Compute the value of the statistic $T_n$ (or the value $T_n/\sigma_n$).

Step 2. Draw $\{\varepsilon_i^*\}_{i=1}^n$, a sequence of i.i.d. random elements $\varepsilon^*$, and compute $a_n = \|T_n^{W*}\|$ (or
$b_n = \|T_n^{W*}\|/\sigma_n^*$; in this case $\sigma_n^*$ is computed as in Step 2 of the Naive Bootstrap algorithm).

Step 3. Repeat Step 2 a large number of times $B \in \mathbb{N}$ in order to obtain a sequence of values
$\{a_n^l\}_{l=1}^B$ (or $\{b_n^l\}_{l=1}^B$).

Step 4. Approximate the p-value of the test by the proportion of values in $\{a_n^l\}_{l=1}^B$ greater than or
equal to $\|T_n\|$ (or by the proportion of values in $\{b_n^l\}_{l=1}^B$ greater than or equal to $\|T_n\|/\sigma_n$).
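
A possible implementation of the wild bootstrap steps is sketched below. It is only an illustration: standard Gaussian multipliers are used as one admissible choice (centered, unit variance, finite moments of all orders), the curves are assumed to be discretized on an equispaced grid, and the function names are hypothetical.

```python
import numpy as np

def l2_norm(f, dt):
    """Riemann-sum approximation of the L2 norm on a grid with spacing dt."""
    return float(np.sqrt(np.sum(f ** 2) * dt))

def wild_bootstrap_pvalue(X, Y, B, dt, rng):
    n = len(Y)
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean()
    prod = Xc * Yc[:, None]                  # row i: (X_i - Xbar)(Y_i - Ybar)
    stat = l2_norm(prod.mean(axis=0), dt)    # Step 1: ||T_n||
    boot = np.empty(B)
    for l in range(B):                       # Steps 2 and 3
        eps = rng.standard_normal(n)         # multipliers (assumed Gaussian)
        boot[l] = l2_norm(eps @ prod / n, dt)    # ||T_n^{W*}||
    return float(np.mean(boot >= stat))      # Step 4: bootstrap p-value

rng = np.random.default_rng(4)
n, p = 200, 50
dt = 1.0 / p
X = np.cumsum(rng.normal(scale=np.sqrt(dt), size=(n, p)), axis=1)  # example curves
Y_dep = X.sum(axis=1) * dt      # noiseless response <Theta, X> with Theta = 1
p_dep = wild_bootstrap_pvalue(X, Y_dep, B=200, dt=dt, rng=rng)
p_const = wild_bootstrap_pvalue(X, np.ones(n), B=50, dt=dt, rng=rng)
```

Unlike the paired scheme, only the multipliers are redrawn in each iteration, so the centered products can be computed once and reused.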

3 Bootstrap calibration vs. asymptotic theory 

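For simplicity, suppose from now on that $b = 0$ and that $X$ has zero mean in (1), that is, suppose that
the regression model is given by

$$Y = \langle\Theta, X\rangle + \varepsilon.$$

Furthermore, $\Delta(h) = E(\langle X, h\rangle Y)$ and, analogously, $\Gamma(h) = E(\langle X, h\rangle X)$. In such case, if we assume
that $\sum_{j=1}^\infty (\Delta(v_j)/\lambda_j)^2 < +\infty$ and $\mathrm{Ker}(\Gamma) = \{0\}$, then

$$\Theta = \sum_{j=1}^\infty \frac{\Delta(v_j)}{\lambda_j} v_j,$$

where $\{(\lambda_j, v_j)\}_{j\in\mathbb{N}}$ are the eigenvalues and eigenfunctions of $\Gamma$ (see Cardot, Ferraty, and Sarda (2003)).
A natural estimator for $\Theta$ is the FPCA estimator based on $k_n$ functional principal components given
by

$$\hat\Theta_{k_n} = \sum_{j=1}^{k_n} \frac{\Delta_n(\hat{v}_j)}{\hat\lambda_j} \hat{v}_j,$$

where $\Delta_n$ is the empirical estimation of $\Delta$, that is, $\Delta_n(h) = (1/n)\sum_{i=1}^n \langle X_i, h\rangle Y_i$, and $\{(\hat\lambda_j, \hat{v}_j)\}_{j\in\mathbb{N}}$
are the eigenvalues and the eigenfunctions of $\Gamma_n$, the empirical estimator of $\Gamma$: $\Gamma_n(h) = (1/n)
\sum_{i=1}^n \langle X_i, h\rangle X_i$.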
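
For discretized curves, the FPCA estimator can be obtained from an eigendecomposition of the matrix that represents $\Gamma_n$ on the grid. The following sketch assumes an equispaced grid with spacing `dt`, Riemann-sum inner products, and zero-mean curves with $b = 0$ as in this section; `fpca_estimator` is a hypothetical helper name.

```python
import numpy as np

def fpca_estimator(X, Y, k, dt):
    """FPCA estimator of Theta based on the k leading principal components."""
    n = len(Y)
    G = (X.T @ X / n) * dt                 # matrix of Gamma_n acting on grid functions
    lam, V = np.linalg.eigh(G)             # eigenvalues in ascending order
    order = np.argsort(lam)[::-1][:k]      # indices of the k largest eigenvalues
    lam = lam[order]
    V = V[:, order] / np.sqrt(dt)          # rescale so that sum(v_j**2) * dt = 1
    scores = X @ V * dt                    # scores[i, j] = <X_i, v_j>
    delta = scores.T @ Y / n               # Delta_n(v_j)
    theta_hat = V @ (delta / lam)          # sum_j Delta_n(v_j) / lambda_j * v_j
    return theta_hat, lam, V

rng = np.random.default_rng(2)
n, p = 200, 20
dt = 1.0 / p
X = rng.normal(size=(n, p))                    # example data, full rank since n > p
theta = np.sin(np.linspace(0.0, np.pi, p))     # example parameter
Y = X @ theta * dt                             # noiseless responses <Theta, X_i>
theta_hat, lam, V = fpca_estimator(X, Y, k=p, dt=dt)
```

In this noiseless example with all $p$ components retained, the estimator reproduces `theta` up to rounding error; in practice $k_n$ is much smaller than the grid size because dividing by small eigenvalues amplifies noise.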

Different statistics can be used for testing the lack of dependence between $X$ and $Y$. Bearing in
mind the expression (5), one can think about using an estimator of $\|\Theta\|^2 = \sum_{j=1}^\infty (\Delta(v_j)/\lambda_j)^2$ in
order to test these hypotheses. Alternatively, the expression (6) can motivate a
different class of statistics based on the estimation of $\|\Delta\|'$.
One asymptotically distribution-free statistic based on the latter approach was given by Cardot, Ferraty, Mas,
and Sarda (2003). They proposed as test statistic

$$T_{1,n} = k_n^{-1/2}\left( \frac{1}{\hat\sigma^2}\|\sqrt{n}\Delta_n\hat{A}_n\|'^2 - k_n \right), \qquad (8)$$

where $\hat{A}_n(\cdot) = \sum_{j=1}^{k_n} \hat\lambda_j^{-1/2}\langle\cdot, \hat{v}_j\rangle\hat{v}_j$ and $\hat\sigma^2$ is a consistent estimator of $\sigma^2$. Cardot, Ferraty, Mas,
and Sarda (2003) showed that, under $H_0$, $T_{1,n}$ converges in distribution to a centered Gaussian
variable with variance equal to 2. Hence, $H_0$ is rejected if $|T_{1,n}| > \sqrt{2}z_{1-\alpha/2}$ ($z_\alpha$ the $\alpha$-quantile of
a $\mathcal{N}(0, 1)$), and accepted otherwise. Besides, Cardot, Ferraty, Mas, and Sarda (2003) also proposed
another calibration of the statistic distribution based on a permutation mechanism.
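
The asymptotic test based on (8) can be sketched as follows, using the identity $\|\sqrt{n}\Delta_n\hat{A}_n\|'^2 = n\sum_{j=1}^{k_n} \Delta_n(\hat{v}_j)^2/\hat\lambda_j$. This is an illustration under assumptions: curves are discretized on an equispaced grid, and the sample variance of $Y$ is used for $\hat\sigma^2$, which is only one possible choice (consistent under $H_0$). The name `t1_test` is hypothetical.

```python
import numpy as np
from statistics import NormalDist

def t1_test(X, Y, k, dt, alpha=0.05):
    """Return (T_{1,n}, reject) for the asymptotic N(0, 2) calibration."""
    n = len(Y)
    G = (X.T @ X / n) * dt                       # discretized Gamma_n
    lam, V = np.linalg.eigh(G)
    order = np.argsort(lam)[::-1][:k]            # k leading eigenpairs
    lam, V = lam[order], V[:, order] / np.sqrt(dt)
    delta = (X @ V * dt).T @ Y / n               # Delta_n(v_j)
    sigma2 = np.var(Y)                           # assumed estimator of sigma^2
    t1 = (n / sigma2 * np.sum(delta ** 2 / lam) - k) / np.sqrt(k)
    crit = np.sqrt(2.0) * NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return float(t1), bool(abs(t1) > crit)

rng = np.random.default_rng(5)
n, p = 200, 50
dt = 1.0 / p
X = np.cumsum(rng.normal(scale=np.sqrt(dt), size=(n, p)), axis=1)  # example curves
Y = X.sum(axis=1) * dt                 # noiseless response <Theta, X> with Theta = 1
t1, reject = t1_test(X, Y, k=3, dt=dt)
```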



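On the other hand, taking into account that $\|\Theta\|^2 = \sum_{j=1}^\infty (\Delta(v_j)/\lambda_j)^2$, one can use the statistic

$$T_{2,n} = \sum_{j=1}^{k_n} \left( \frac{\Delta_n(\hat{v}_j)}{\hat\lambda_j} \right)^2, \qquad (9)$$

whose limit distribution is not known.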



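Finally, a natural competing statistic is the one proposed throughout Section 2.3,

$$T_{3,n} = \left\| \frac{1}{n}\sum_{i=1}^n (X_i - \bar{X})(Y_i - \bar{Y}) \right\|, \qquad (10)$$

which we will denote by "F-test" from now on since it is the natural generalization of the
well-known F-test in the finite-dimensional context. Another possibility is to consider the studentized
version of (10),

$$T_{3s,n} = \frac{1}{\hat\sigma}\left\| \frac{1}{n}\sum_{i=1}^n (X_i - \bar{X})(Y_i - \bar{Y}) \right\|, \qquad (11)$$

where $\hat\sigma^2$ is the empirical estimation of $\sigma^2$.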


In general, for statistics such as (8), (9), (10) and (11), the calibration of the distribution
can be obtained by using bootstrap. Furthermore, in the previous section, "naive" and "wild"
bootstrap were shown to be consistent for the F-test, that is, the distributions of $T_{3,n}$ and $T_{3s,n}$
can be approximated by their corresponding bootstrap distributions, and $H_0$ can be rejected when
the statistic value does not belong to the bootstrap acceptance region at
confidence level $1 - \alpha$. The same bootstrap calibration can be applied to the tests based on $T_{1,n}$ and
$T_{2,n}$, although the consistency of the bootstrap procedure in these cases has not been proved in this
work.
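The bootstrap calibration of the F-test can be sketched as follows for curves observed on a common grid. This is only one possible multiplier ("wild") bootstrap implementation: the standard normal multipliers, the Riemann-sum approximation of the norm on $[0,1]$, the use of an upper-tail p-value, and the function names are assumptions of the sketch, not details confirmed by the text above.

```python
import numpy as np


def l2_norm(f):
    """L2[0, 1] norm of a curve on an equispaced grid (Riemann sum)."""
    return np.sqrt(np.mean(f ** 2))


def f_test_wild_bootstrap(X, Y, B=1000, rng=None):
    """Return the statistic (10) and a bootstrap p-value."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(Y)
    # Row i holds the curve (X_i - Xbar) * (Y_i - Ybar).
    prod = (X - X.mean(axis=0)) * (Y - Y.mean())[:, None]
    t3 = l2_norm(prod.sum(axis=0) / np.sqrt(n))
    # Each bootstrap replicate reweights the n curves by i.i.d. multipliers.
    eps = rng.normal(size=(B, n))             # assumed N(0, 1) multipliers
    boot = np.sqrt(np.mean((eps @ prod / np.sqrt(n)) ** 2, axis=1))
    # p-value: share of bootstrap values at least as large as the observed one.
    return t3, np.mean(boot >= t3)
```

With this convention, $H_0$ would be rejected at level $\alpha$ when the returned p-value is below $\alpha$. The statistic needs no tuning parameter such as $k_n$, which is why the sketch takes only the data and the number of replicates.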



4 Simulation and real data applications 

In this section a simulation study and an application to a real dataset illustrate the performance of 
the asymptotic approach and the bootstrap calibration from a practical point of view. 



4.1 Simulation study 

We have simulated $ns = 500$ samples, each composed of $n \in \{50, 100\}$ observations from
the functional linear model $Y = \langle \Theta, X\rangle + \varepsilon$, where $X$ is a Brownian motion and $\varepsilon \sim \mathcal{N}(0, \sigma^2)$ with
signal-to-noise ratio $r = \sigma/\sqrt{E(\langle X, \Theta\rangle^2)} \in \{0.5, 1, 2\}$.
Under $H_0$, we have considered the model parameter $\Theta_0(t) = 0$, $t \in [0,1]$, whereas under $H_1$, the
selected model parameter was $\Theta_1(t) = \sin(2\pi t^3)^3$, $t \in [0, 1]$. Furthermore, under $H_0$ we have chosen
$\sigma = 1$, while under the alternative $H_1$ we assigned the three values of $\sigma$ determined by the ratios $r$ given
above. Let us remark that both $X$ and $\Theta$ were discretized to 100 equidistant design points.
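The simulation design can be sketched in Python as below. The grid $t_k = k/100$, the Riemann-sum inner product and the function names are assumptions made for illustration. Under the alternative, $\sigma$ is obtained from the ratio $r$ using $E(\langle X, \Theta\rangle^2) = \int\!\!\int \Theta(s)\min(s,t)\Theta(t)\,ds\,dt$, which holds for a standard Brownian motion.

```python
import numpy as np


def theta1(t):
    """Parameter under the alternative: Theta_1(t) = sin(2 * pi * t^3)^3."""
    return np.sin(2.0 * np.pi * t ** 3) ** 3


def simulate_sample(n, rng, r=None, p=100):
    """Draw (X, Y); r=None gives the null model (Theta = 0, sigma = 1)."""
    t = np.arange(1, p + 1) / p               # assumed grid t_k = k / p
    # Brownian motion: cumulative sums of independent N(0, 1/p) increments.
    X = np.cumsum(rng.normal(scale=np.sqrt(1.0 / p), size=(n, p)), axis=1)
    if r is None:
        return X, rng.normal(size=n)
    th = theta1(t)
    # E<X, Theta>^2 via the Brownian covariance min(s, t), discretized.
    signal_var = th @ np.minimum.outer(t, t) @ th / p ** 2
    sigma = r * np.sqrt(signal_var)
    Y = X @ th / p + rng.normal(scale=sigma, size=n)
    return X, Y
```

A call such as `simulate_sample(50, np.random.default_rng(0), r=2)` would produce one sample of size 50 under the alternative with $r = 2$; repeating it 500 times mimics the design described above.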

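We have selected the test statistics which were introduced in the previous section: (8), (9), (10)
and (11). For (8), three distribution approximations were considered: the asymptotic approach
($\mathcal{N}(0, 2)$) and the following two bootstrap calibrations:

$$T_{1,n}^{*(a)} = \frac{1}{\sqrt{k_n}} \left( \frac{n}{(\hat\sigma^{*})^{2}} \sum_{j=1}^{k_n} \frac{(\Delta_n^{*}(\hat v_j))^2}{\hat\lambda_j} - k_n \right), \qquad T_{1,n}^{*(b)} = \frac{1}{\sqrt{k_n}} \left( \frac{n}{\hat\sigma^{2}} \sum_{j=1}^{k_n} \frac{(\Delta_n^{*}(\hat v_j))^2}{\hat\lambda_j} - k_n \right).$$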

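The difference between the two proposed bootstrap approximations is that in the former the
estimation of $\sigma^2$ is also bootstrapped in each iteration. On the other hand, for (9), (10) and (11), only
the bootstrap approaches were computed:

$$T_{2,n}^{*} = \sum_{j=1}^{k_n} \left( \frac{\Delta_n^{*}(\hat v_j)}{\hat\lambda_j} \right)^2, \qquad T_{3,n}^{*} = \left\| \frac{1}{\sqrt{n}} \sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y)\varepsilon_i^{*} \right\|, \qquad T_{3s,n}^{*} = \frac{1}{\hat\sigma^{*}} \left\| \frac{1}{\sqrt{n}} \sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y)\varepsilon_i^{*} \right\|.$$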



For this simulation study, we have used the "wild" bootstrap algorithm introduced in Section 2.4
for the F-test and its studentized version, and the following adaptation of this consistent "wild"
bootstrap for $T_{1,n}$ and $T_{2,n}$.

Algorithm 3 (Wild Bootstrap).

Step 1. Compute the value of the statistic $T_{1,n}$ (or the value $T_{2,n}$).

Step 2. Draw $\{\varepsilon_i^{*}\}_{i=1}^{n}$, a sequence of i.i.d. random elements $\varepsilon^{*}$, and define $Y_i^{*} = Y_i \varepsilon_i^{*}$ for all
$i = 1, \ldots, n$.

Step 3. Build $\Delta_n^{*}(\cdot) = n^{-1} \sum_{i=1}^{n} \langle X_i, \cdot\rangle Y_i^{*}$, and compute $a_n = |T_{1,n}^{*}|$ (or $b_n = |T_{2,n}^{*}|$).

Step 4. Repeat Steps 2 and 3 a large number of times $B \in \mathbb{N}$ in order to obtain a sequence of
values $\{a_n^{l}\}_{l=1}^{B}$ (or $\{b_n^{l}\}_{l=1}^{B}$).

Step 5. Approximate the p-value of the test by the proportion of values in $\{a_n^{l}\}_{l=1}^{B}$ greater than or
equal to $|T_{1,n}|$ (or by the proportion of values in $\{b_n^{l}\}_{l=1}^{B}$ greater than or equal to $|T_{2,n}|$).

Let us indicate that 1,000 bootstrap iterations were done in each simulation.
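The five steps of Algorithm 3 can be sketched in Python as follows. The sketch covers $T_{2,n}$ and the variant of $T_{1,n}$ in which the variance estimate is kept fixed across replicates. The standard normal multipliers, the residual-variance estimator of $\sigma^2$, the centering of the data, the Riemann-sum inner product on an equispaced grid and the function name are assumptions of this illustration.

```python
import numpy as np


def wild_bootstrap_pvalues(X, Y, k, B=1000, rng=None):
    """Bootstrap p-values for T_1n (fixed sigma^2 estimate) and T_2n."""
    rng = np.random.default_rng() if rng is None else rng
    n, p = X.shape
    w = 1.0 / p                                # Riemann-sum weight on [0, 1]
    X = X - X.mean(axis=0)
    Y = Y - Y.mean()
    lam, U = np.linalg.eigh((w / n) * X.T @ X)
    top = np.argsort(lam)[::-1][:k]            # k largest eigenvalues
    lam, scores = lam[top], np.sqrt(w) * X @ U[:, top]   # <X_i, v_j>

    def stats(delta, sigma2):
        t1 = (n * np.sum(delta ** 2 / lam, axis=-1) / sigma2 - k) / np.sqrt(k)
        t2 = np.sum((delta / lam) ** 2, axis=-1)
        return t1, t2

    # Step 1: statistics on the original sample.
    delta = Y @ scores / n                     # Delta_n(v_j)
    sigma2 = np.mean((Y - scores @ (delta / lam)) ** 2)  # assumed estimator
    t1, t2 = stats(delta, sigma2)
    # Steps 2-4: Y*_i = Y_i * eps*_i, then Delta*_n(v_j), for B replicates.
    eps = rng.normal(size=(B, n))              # assumed N(0, 1) multipliers
    a, b = stats((eps * Y) @ scores / n, sigma2)
    # Step 5: proportion of bootstrap values at least as large as observed.
    return np.mean(np.abs(a) >= abs(t1)), np.mean(np.abs(b) >= abs(t2))
```

All $B$ replicates are drawn at once as a `B x n` matrix of multipliers, so Steps 2 to 4 reduce to a single matrix product; the eigen-decomposition is computed only once because the curves $X_i$ are not resampled.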

Since $k_n$ and $\alpha$ must be fixed to run the procedure, the study was repeated with different numbers
of principal components involved ($k_n \in \{1, \ldots, 20\}$) and nominal levels ($\alpha \in \{0.2, 0.1, 0.05, 0.01\}$).
Nevertheless, in order to simplify the reading, the information collected in the following tables
corresponds to only three of the values of $k_n$ which were analyzed: $k_n = 5$, $k_n = 10$ and $k_n = 20$.

    n     α        N(0,2)           T_{1,n}^{*(a)}       T_{1,n}^{*(b)}        T_{2,n}^{*}        T_{3,n}^{*}  T_{3s,n}^{*}
       k_n:      5     10     20      5     10     20      5     10     20      5     10     20
   50   20%   19.4   17.6   16.0   21.4   21.6   20.0   21.6   19.0   15.2   19.8   20.8   18.4          21.6          20.8
        10%   10.8   10.4    8.2    9.0   10.8   10.6    8.0    7.2    3.2    8.6    7.2    7.2          11.8          11.2
         5%    8.2    7.0    4.4    5.0    4.0    4.6    5.0    2.4    0.0    4.0    3.2    3.0           6.0           6.2
         1%    4.8    4.2    2.2    1.2    0.4    0.0    0.6    0.0    0.0    0.2    0.6    0.4           0.6           1.2
  100   20%   15.0   19.4   20.0   20.8   21.0   19.0   21.0   20.8   18.0   21.4   19.4   17.6          21.6          21.2
        10%    8.6    9.6    9.0   11.8   10.8   10.4   10.4    9.6    6.2    9.8    8.8    7.0          11.6          11.8
         5%    5.6    5.2    4.0    4.4    4.6    3.6    3.6    3.4    2.2    4.6    5.2    2.8           5.6           5.6
         1%    2.6    2.4    1.2    1.4    1.2    0.8    1.2    0.6    0.2    1.0    0.6    0.8           0.4           0.4

Table 1: Comparison of the estimated levels (%) for $T_{1,n}$ (using the asymptotic distribution $\mathcal{N}(0, 2)$ and the
bootstrap distributions of $T_{1,n}^{*(a)}$ and $T_{1,n}^{*(b)}$), $T_{2,n}$ (using the bootstrap distribution of $T_{2,n}^{*}$), $T_{3,n}$ (using the
bootstrap distribution of $T_{3,n}^{*}$) and its studentized version, $T_{3s,n}$ (using the bootstrap distribution of $T_{3s,n}^{*}$).







    n     α        N(0,2)           T_{1,n}^{*(a)}       T_{1,n}^{*(b)}        T_{2,n}^{*}        T_{3,n}^{*}  T_{3s,n}^{*}
       k_n:      5     10     20      5     10     20      5     10     20      5     10     20
   50   20%  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0   88.8    0.0    0.0         100.0         100.0
        10%  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0   60.8    0.0    0.0         100.0         100.0
         5%  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0   99.0   32.2    0.0    0.0         100.0         100.0
         1%  100.0  100.0  100.0  100.0  100.0  100.0  100.0   99.4   51.4    3.4    0.0    0.0          99.4         100.0
  100   20%  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0    1.0    0.0         100.0         100.0
        10%  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0    0.0    0.0         100.0         100.0
         5%  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0   98.4    0.0    0.0         100.0         100.0
         1%  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0   70.0    0.0    0.0         100.0         100.0

Table 2: For $r = 0.5$, comparison of the empirical power (%) for $T_{1,n}$ (using the asymptotic distribution $\mathcal{N}(0, 2)$
and the bootstrap distributions of $T_{1,n}^{*(a)}$ and $T_{1,n}^{*(b)}$), $T_{2,n}$ (using the bootstrap distribution of $T_{2,n}^{*}$), $T_{3,n}$
(using the bootstrap distribution of $T_{3,n}^{*}$) and its studentized version, $T_{3s,n}$ (using the bootstrap distribution
of $T_{3s,n}^{*}$).



Table 1 displays the sizes of the test statistics obtained in the simulation study. For
$T_{1,n}$, it can be highlighted that the bootstrap approaches have sizes closer to the theoretical $\alpha$ than
the asymptotic approximation, mainly when $k_n$ is small. If we compare the performance
of the two bootstrap procedures proposed, it seems that if $\sigma^2$ is bootstrapped ($T_{1,n}^{*(a)}$) the results
are better than if the same estimation of the variance is considered in all the bootstrap replications
($T_{1,n}^{*(b)}$), above all when $k_n$ is large. As far as $T_{2,n}$ is concerned, the estimated levels are quite near to
the nominal ones, $k_n = 20$ being the case in which they are farthest from the theoretical $\alpha$. Finally,
it must be remarked that the F-test and its studentized version also get good results in terms
of test levels, which are slightly closer to $\alpha$ when one uses the bootstrap distribution of $T_{3s,n}^{*}$ to
approximate the distribution of the statistic.
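When reading the estimated levels, it may help to keep the Monte Carlo error in mind. Assuming the 500 simulated samples are independent and a test has exact level $\alpha$, the number of rejections is binomial, so an estimated level has standard error $\sqrt{\alpha(1-\alpha)/500}$. The short Python sketch below (the function name is hypothetical) prints a band of two standard errors, in percentage points, for each nominal level.

```python
import math


def level_standard_error(alpha, ns=500):
    """Binomial standard error, in percentage points, of an estimated level
    based on ns independent simulated samples."""
    return 100.0 * math.sqrt(alpha * (1.0 - alpha) / ns)


for alpha in (0.20, 0.10, 0.05, 0.01):
    band = 2.0 * level_standard_error(alpha)
    print(f"alpha = {alpha:.2f}: nominal level +/- {band:.1f} points")
```

Differences between methods that are smaller than such a band are hard to distinguish from simulation noise.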

On the other hand, Table 2, Table 3 and Table 4 show the
empirical power obtained with the different procedures for each considered signal-to-noise ratio $r$.
In terms of power, when $r = 0.5$ the results for all the methods are similar, except for $T_{2,n}$, for which
the empirical power decreases drastically, above all when $k_n$ increases (this effect is also observed
for $r = 1$ and $r = 2$). This fact seems to be due to the construction of $T_{2,n}$, since this test statistic
is the only one which does not involve the estimation of $\sigma^2$. In addition, the power of $T_{1,n}$ also falls
abruptly when $T_{1,n}^{*(b)}$ is considered, $n$ is small and $k_n$ is very large.

A similar situation can be observed when $r = 1$ and $r = 2$. In the latter case the
empirical power is smaller for all the methods in general, with an important loss of power
when the sample is small ($n = 50$), and $k_n$ increases and/or $\alpha$ decreases (see Table 4).
Furthermore, in this case, the empirical power relies heavily on the selected $k_n$
value. Hence, the advantage of using $T_{3,n}$ or $T_{3s,n}$ is that they do not require the selection of any
parameter and they are competitive in terms of power. Nevertheless, it also seems that an adequate
$k_n$ selection can make $T_{1,n}$ obtain larger empirical power than $T_{3,n}$ or $T_{3s,n}$ in some cases.







    n     α        N(0,2)           T_{1,n}^{*(a)}       T_{1,n}^{*(b)}        T_{2,n}^{*}        T_{3,n}^{*}  T_{3s,n}^{*}
       k_n:      5     10     20      5     10     20      5     10     20      5     10     20
   50   20%  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0   98.2   66.6    3.6    0.2         100.0         100.0
        10%  100.0  100.0  100.0  100.0  100.0   99.8  100.0   99.8   89.6   33.6    0.8    0.0         100.0         100.0
         5%  100.0  100.0   99.8  100.0  100.0   99.6  100.0   99.0   59.6   16.6    0.2    0.0          99.2          99.2
         1%  100.0  100.0   99.6   99.6   97.6   94.6   95.2   67.6    2.6    2.2    0.0    0.0          87.8          92.4
  100   20%  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0   97.0    7.8    0.0         100.0         100.0
        10%  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0   86.4    2.2    0.0         100.0         100.0
         5%  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0   67.8    1.0    0.0         100.0         100.0
         1%  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0   99.8   21.6    0.2    0.0         100.0         100.0

Table 3: For $r = 1$, comparison of the empirical power (%) for $T_{1,n}$ (using the asymptotic distribution $\mathcal{N}(0, 2)$
and the bootstrap distributions of $T_{1,n}^{*(a)}$ and $T_{1,n}^{*(b)}$), $T_{2,n}$ (using the bootstrap distribution of $T_{2,n}^{*}$), $T_{3,n}$
(using the bootstrap distribution of $T_{3,n}^{*}$) and its studentized version, $T_{3s,n}$ (using the bootstrap distribution
of $T_{3s,n}^{*}$).







    n     α        N(0,2)           T_{1,n}^{*(a)}       T_{1,n}^{*(b)}        T_{2,n}^{*}        T_{3,n}^{*}  T_{3s,n}^{*}
       k_n:      5     10     20      5     10     20      5     10     20      5     10     20
   50   20%   85.4   75.6   66.8   89.0   81.2   77.2   89.0   76.8   51.4   34.0   11.8    7.2          90.4          89.8
        10%   80.0   68.6   56.4   79.4   68.6   59.4   76.4   57.4   20.2   16.6    4.0    2.4          79.0          79.0
         5%   74.4   62.2   48.4   67.4   51.6   43.6   60.8   37.8    6.2   10.4    1.0    0.4          67.8          67.2
         1%   67.4   51.4   35.6   40.0   26.4   20.2   25.4    6.0    0.0    0.8    0.0    0.0          34.4          39.0
  100   20%   99.8   98.8   94.6  100.0   99.8   98.0  100.0   99.2   94.2   60.0   14.6    7.6          99.8          99.8
        10%   99.6   96.6   91.2   99.6   97.2   93.6   99.6   96.0   82.4   34.2    6.2    2.0          97.8          97.4
         5%   99.6   95.6   85.8   97.8   94.0   85.8   97.2   90.4   64.6   18.0    2.8    0.4          94.4          94.4
         1%   97.6   91.4   75.4   88.2   76.4   64.0   85.2   63.4   26.2    2.2    0.8    0.0          79.2          82.4

Table 4: For $r = 2$, comparison of the empirical power (%) for $T_{1,n}$ (using the asymptotic distribution $\mathcal{N}(0, 2)$
and the bootstrap distributions of $T_{1,n}^{*(a)}$ and $T_{1,n}^{*(b)}$), $T_{2,n}$ (using the bootstrap distribution of $T_{2,n}^{*}$), $T_{3,n}$
(using the bootstrap distribution of $T_{3,n}^{*}$) and its studentized version, $T_{3s,n}$ (using the bootstrap distribution
of $T_{3s,n}^{*}$).

4.2 Data application 

For the real data application, we have obtained concentrations of hourly averaged NO$_x$ in the
neighborhood of a power station belonging to ENDESA, located in As Pontes in the Northwest of Spain.
During unfavorable meteorological conditions, NO$_x$ levels can quickly rise and cause an air-quality
episode. The aim is to forecast NO$_x$ with a half-hour horizon to allow the power plant staff to
keep NO$_x$ concentrations from reaching the limit values fixed by the current environmental legislation.
This implies that it is necessary to estimate properly the regression model which defines the
relationship between the NO$_x$ concentration observed in the last minutes ($X$) and the NO$_x$
concentration half an hour ahead ($Y$). For that, a first step is to determine whether there exists a linear
dependence between $X$ and $Y$.

Therefore, we have built a sample where each curve $X$ corresponds to 240 consecutive minute-by-minute
values of hourly averaged NO$_x$ concentration, and the response $Y$ corresponds to the NO$_x$ value half
an hour ahead (from Jan 2007 to Dec 2009). Applying the tests for dependence to the dataset, the
null hypothesis is rejected in all cases (thus, there is a linear relationship between the variables),
except for $T_{2,n}$ when $k_n$ is large (see Table 5). Nevertheless, as we have commented in
the simulation study, this test statistic does not take into account the variance term and its power
is clearly lower than the power of the other tests.



                   N(0,2)           T_{1,n}^{*(a)}       T_{1,n}^{*(b)}        T_{2,n}^{*}        T_{3,n}^{*}  T_{3s,n}^{*}
       k_n:      1      5     10      1      5     10      1      5     10      1      5     10
   p-value:  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.002  0.011         0.000         0.000

Table 5: Real data application. P-values for $T_{1,n}$ (using the asymptotic distribution $\mathcal{N}(0, 2)$ and the
bootstrap distributions of $T_{1,n}^{*(a)}$ and $T_{1,n}^{*(b)}$), $T_{2,n}$ (using the bootstrap distribution of $T_{2,n}^{*}$), $T_{3,n}$ (using the
bootstrap distribution of $T_{3,n}^{*}$) and its studentized version, $T_{3s,n}$ (using the bootstrap distribution of $T_{3s,n}^{*}$).



5 Final comments 

The proposed bootstrap methods seem to give test sizes closer to the nominal ones than the tests
based on the asymptotic distributions. In terms of power, the test statistics which include a
consistent estimation of the error variance $\sigma^2$ are better than the tests which do not take it into account.
Furthermore, in all cases a suitable choice of $k_n$ seems to be quite important, and currently it
is still an open question.

Besides the optimal $k_n$ selection, other issues related to these dependence tests require further
research, such as their extension to functional linear models with functional response. On the other
hand, and in addition to the natural usefulness of this test, it would be interesting to combine it with
the functional ANOVA test (see Cuevas, Febrero, and Fraiman (2004), and González-Rodríguez,
Colubi, and Gil (2012)) in order to develop an ANCOVA test in this context.